Exploring the Metabolic and Genetic Control of
Gene Expression on a Genomic Scale 

Joseph L. DeRisi, Vishwanath R. Iyer, Patrick O. Brown*

DNA microarrays containing virtually every gene of Saccharomyces cerevisiae were used 
to carry out a comprehensive investigation of the temporal program of gene expression 
accompanying the metabolic shift from fermentation to respiration. The expression 
profiles observed for genes with known metabolic functions pointed to features of the 
metabolic reprogramming that occur during the diauxic shift, and the expression patterns 
of many previously uncharacterized genes provided clues to their possible functions. The 
same DNA microarrays were also used to identify genes whose expression was affected 
by deletion of the transcriptional co-repressor TUP1 or overexpression of the transcrip- 
tional activator YAP1. These results demonstrate the feasibility and utility of this ap- 
proach to genomewide exploration of gene expression patterns. 



The complete sequences of nearly a dozen 
microbial genomes are known, and in the 
next several years we expect to know the 
complete genome sequences of several 
metazoans, including the human genome. 
Defining the role of each gene in these 
genomes will be a formidable task, and un- 
derstanding how the genome functions as a 
whole in the complex natural history of a 
living organism presents an even greater 
challenge. 

Knowing when and where a gene is 
expressed often provides a strong clue as to 
its biological role. Conversely, the pattern 
of genes expressed in a cell can provide 
detailed information about its state. Al- 
though regulation of protein abundance in 
a cell is by no means accomplished solely 
by regulation of mRNA, virtually all dif- 
ferences in cell type or state are correlated 
with changes in the mRNA levels of many 
genes. This is fortuitous because the only 
specific reagent required to measure the 
abundance of the mRNA for a specific 
gene is a cDNA sequence. DNA microar- 
rays, consisting of thousands of individual 
gene sequences printed in a high-density 
array on a glass microscope slide (1, 2),
provide a practical and economical tool 
for studying gene expression on a very 
large scale (3-6). 

Saccharomyces cerevisiae is an especially 
favorable organism in which to conduct a
systematic investigation of gene expression. 
The genes are easy to recognize in the genome
sequence, cis regulatory elements are
generally compact and close to the tran- 
scription units, much is already known 
about its genetic regulatory mechanisms, 
and a powerful set of tools is available for its 
analysis. 

A recurring cycle in the natural history 
of yeast involves a shift from anaerobic 
(fermentation) to aerobic (respiration) me- 
tabolism. Inoculation of yeast into a medi- 
um rich in sugar is followed by rapid growth 
fueled by fermentation, with the production 
of ethanol. When the fermentable sugar is 
exhausted, the yeast cells turn to ethanol as 
a carbon source for aerobic growth. This 
switch from anaerobic growth to aerobic 
respiration upon depletion of glucose, re- 
ferred to as the diauxic shift, is correlated 
with widespread changes in the expression 
of genes involved in fundamental cellular 
processes such as carbon metabolism, pro- 
tein synthesis, and carbohydrate storage 
(7). We used DNA microarrays to charac- 
terize the changes in gene expression that 
take place during this process for nearly the 
entire genome, and to investigate the ge- 
netic circuitry that regulates and executes 
this program. 

Yeast open reading frames (ORFs) were 
amplified by the polymerase chain reaction 
(PCR), with a commercially available set of 
primer pairs (8). DNA microarrays, con- 
taining approximately 6400 distinct DNA 
sequences, were printed onto glass slides by
using a simple robotic printing device (9).
Cells from an exponentially growing culture 
of yeast were inoculated into fresh medium 
and grown at 30°C for 21 hours. After an 
initial 9 hours of growth, samples were har- 
vested at seven successive 2-hour intervals, 
and mRNA was isolated (10). Fluorescently 
labeled cDNA was prepared by reverse transcription
in the presence of Cy3 (green)- or Cy5 (red)-labeled
deoxyuridine triphosphate (dUTP) (11) and then hybridized to
the microarrays (12). To maximize the re- 
liability with which changes in expression 
levels could be discerned, we labeled cDNA 
prepared from cells at each successive time 
point with Cy5, then mixed it with a Cy3- 
labeled "reference" cDNA sample prepared 
from cells harvested at the first interval 
after inoculation. In this experimental de- 
sign, the relative fluorescence intensity 
measured for the Cy3 and Cy5 fluors at 
each array element provides a reliable mea- 
sure of the relative abundance of the corre- 
sponding mRNA in the two cell popula- 
tions (Fig. 1). Data from the series of seven 
samples (Fig. 2), consisting of more than 
43,000 expression-ratio measurements,
were organized into a database to facilitate 
efficient exploration and analysis of the 
results. This database is publicly available 
on the Internet (13). 
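
The two-color design described above reduces each array element to a single ratio of Cy5 (sample) to Cy3 (reference) fluorescence. The following is a minimal sketch of that calculation, not the authors' analysis software: the function name, the `floor` parameter, and the intensity values are illustrative assumptions, and expressing the ratio on a log2 scale is a common convention rather than something stated in the text.

```python
import math

def log2_ratio(cy5, cy3, floor=1.0):
    """Return log2(Cy5/Cy3) for one array element.

    `floor` is an assumed lower bound that keeps near-zero
    background-subtracted intensities from giving undefined ratios.
    """
    return math.log2(max(cy5, floor) / max(cy3, floor))

# Hypothetical background-subtracted intensities: (Cy5 sample, Cy3 reference).
spots = {
    "geneA": (4000.0, 1000.0),  # fourfold higher in the sample
    "geneB": (500.0, 2000.0),   # fourfold lower in the sample
    "geneC": (1500.0, 1500.0),  # unchanged
}

ratios = {name: log2_ratio(cy5, cy3) for name, (cy5, cy3) in spots.items()}
for name, value in ratios.items():
    print(name, value)
```

On a log2 scale, induction and repression by the same factor are symmetric around zero, which makes the twofold and fourfold thresholds used in the text easy to apply.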

During exponential growth in glucose- 
rich medium, the global pattern of gene 
expression was remarkably stable. Indeed, 
when gene expression patterns between the 
first two cell samples (harvested at a 2-hour 
interval) were compared, mRNA levels dif- 
fered by a factor of 2 or more for only 19 
genes (0.3%), and the largest of these dif- 
ferences was only 2.7-fold (14). However, as 
glucose was progressively depleted from the 
growth media during the course of the ex- 
periment, a marked change was seen in the 
global pattern of gene expression. mRNA 
levels for approximately 710 genes were 
induced by a factor of at least 2, and the 
mRNA levels for approximately 1030 genes 
declined by a factor of at least 2. Messenger 
RNA levels for 183 genes increased by a 
factor of at least 4, and mRNA levels for 
203 genes diminished by a factor of at least 
4. About half of these differentially expressed
genes have no currently recognized
function and are not yet named. Indeed, 
more than 400 of the differentially ex- 
pressed genes have no apparent homology 
to any gene whose function is known (15).
The responses of these previously uncharacterized
genes to the diauxic shift therefore
provide the first small clue to their possible
roles.
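
Counts like those above can be obtained by applying a fold-change threshold to each gene's time series of expression ratios. The sketch below assumes that a gene counts as induced (or repressed) if it crosses the threshold at any timepoint; the text does not state the exact criterion, and the gene names and ratios are hypothetical.

```python
def count_changed(ratios, fold):
    """Count genes induced or repressed by at least `fold`.

    `ratios` maps a gene name to its list of expression ratios
    (sample / reference), one per timepoint. A gene is counted if it
    crosses the threshold at any timepoint (an assumed criterion).
    """
    induced = sum(1 for r in ratios.values() if max(r) >= fold)
    repressed = sum(1 for r in ratios.values() if min(r) <= 1.0 / fold)
    return induced, repressed

# Hypothetical time series of expression ratios.
series = {
    "g1": [1.0, 1.2, 2.5, 5.0],
    "g2": [1.0, 0.8, 0.4, 0.2],
    "g3": [1.0, 1.1, 0.9, 1.0],
    "g4": [1.0, 1.5, 2.0, 2.2],
}

print(count_changed(series, 2))  # threshold of twofold
print(count_changed(series, 4))  # threshold of fourfold
```

Under this criterion a gene that is first induced and later repressed would be counted in both groups.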

The global view of changes in expres- 
sion of genes with known functions pro- 
vides a vivid picture of the way in which 
the cell adapts to a changing environ- 
ment. Figure 3 shows a portion of the yeast 
metabolic pathways involved in carbon 
and energy metabolism. Mapping the 
changes we observed in the mRNAs en- 
coding each enzyme onto this framework 
allowed us to infer the redirection in the 
flow of metabolites through this system. 
We observed large inductions of the genes 
coding for the enzymes aldehyde dehydrogenase
(ALD2) and acetyl-coenzyme A (CoA)
synthase (ACS1), which function
together to convert the products of
alcohol dehydrogenase into acetyl-CoA, 
which in turn is used to fuel the tricarbox- 
ylic acid (TCA) cycle and the glyoxylate 
cycle. The concomitant shutdown of tran- 
scription of the genes encoding pyruvate 
decarboxylase and induction of pyruvate 
carboxylase rechannels pyruvate away 
from acetaldehyde, and instead to oxalac- 
etate, where it can serve to supply the 
TCA cycle and gluconeogenesis. Induction
of the pivotal genes PCK1, encoding
phosphoenolpyruvate carboxykinase, and
FBP1, encoding fructose 1,6-biphosphatase,
switches the directions of two key
irreversible steps in glycolysis, reversing
the flow of metabolites along the reversible
steps of the glycolytic pathway toward
the essential biosynthetic precursor,
glucose-6-phosphate. Induction of the genes
coding for the trehalose synthase and gly- 
cogen synthase complexes promotes chan- 
neling of glucose-6-phosphate into these 
carbohydrate storage pathways. 

Just as the changes in expression of 
genes encoding pivotal enzymes can pro- 
vide insight into metabolic reprogram- 
ming, the behavior of large groups of func- 
tionally related genes can provide a broad 
view of the systematic way in which the 
yeast cell adapts to a changing environ- 
ment (Fig. 4). Several classes of genes, 
such as cytochrome c-related genes and 
those involved in the TCA/glyoxylate cy- 
cle and carbohydrate storage, were coordi- 
nately induced by glucose exhaustion. In 
contrast, genes devoted to protein synthesis,
including ribosomal proteins, tRNA
synthetases, and translation elongation
and initiation factors, exhibited a coordinated
decrease in expression. More than
95% of ribosomal genes showed at least 
twofold decreases in expression during the 
diauxic shift (Fig. 4) (13). A noteworthy 
and illuminating exception was that the
genes encoding mitochondrial ribosomal
proteins were generally induced rather than
repressed after glucose limitation, highlighting
the requirement for mitochondrial
biogenesis (13). As more is learned about
the functions of every gene in the yeast 
genome, the ability to gain insight into a 
cell's response to a changing environment 
through its global gene expression patterns 
will become increasingly powerful. 

Several distinct temporal patterns of ex- 
pression could be recognized, and sets of 
genes could be grouped on the basis of the 
similarities in their expression patterns. The 
characterized members of each of these 
groups also shared important similarities in 
their functions. Moreover, in most cases, 
common regulatory mechanisms could be 
inferred for sets of genes with similar expres- 
sion profiles. For example, seven genes 
showed a late induction profile, with mRNA 
levels increasing by more than ninefold at
the last timepoint but less than threefold at
the preceding timepoint (Fig. 5B). All of 
these genes were known to be glucose-re- 
pressed, and five of the seven were previously 
noted to share a common upstream activating
sequence (UAS), the carbon source response
element (CSRE) (16-20). A search
in the promoter regions of the remaining two
genes, ACR1 and IDP2, revealed that
ACR1, a gene essential for ACS1 activity,
also possessed a consensus CSRE motif, but
interestingly, IDP2 did not. A search of the
entire yeast genome sequence for the con- 
sensus CSRE motif revealed only four addi- 
tional candidate genes, none of which 
showed a similar induction. 
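
The late induction profile described above is defined by two thresholds, so selecting it from a table of ratios is a simple filter. This sketch uses the ninefold and threefold cutoffs from the text; the function name and the example profiles are hypothetical.

```python
def late_induced(profiles, last_min=9.0, prev_max=3.0):
    """Return genes induced more than `last_min`-fold at the final
    timepoint but less than `prev_max`-fold at the preceding one."""
    return sorted(
        gene
        for gene, ratios in profiles.items()
        if ratios[-1] > last_min and ratios[-2] < prev_max
    )

# Hypothetical seven-point expression-ratio profiles.
profiles = {
    "x1": [1.0, 1.1, 1.5, 2.0, 2.5, 2.8, 12.0],  # late, sharp induction
    "x2": [1.0, 1.5, 3.0, 5.0, 7.0, 8.0, 10.0],  # gradual induction
    "x3": [1.0, 1.0, 1.0, 1.0, 1.0, 2.0, 4.0],   # weak induction
}

print(late_induced(profiles))
```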

Examples from additional groups of 
genes that shared expression profiles are 
illustrated in Fig. 5, C through F. The 
sequences upstream of the named genes in 
Fig. 5C all contain stress response ele- 
ments (STRE), and with the exception 




Fig. 1. Yeast genome microarray. The actual size of the microarray is 18 mm by 18 mm. The
microarray was printed as described (9). This image was obtained with the same fluorescent
scanning confocal microscope used to collect all the data we report (49). A fluorescently labeled
cDNA probe was prepared from mRNA isolated from cells harvested shortly after inoculation (culture
density of <5 x 10^8 cells/ml and media glucose level of 19 g/liter) by reverse transcription in the
presence of Cy3-dUTP. Similarly, a second probe was prepared from mRNA isolated from cells taken
from the same culture 9.5 hours later (culture density of ~2 x 10^8 cells/ml, with a glucose level of
<0.2 g/liter) by reverse transcription in the presence of Cy5-dUTP. In this image, hybridization of the
Cy3-dUTP-labeled cDNA (that is, mRNA expression at the initial timepoint) is represented as a green
signal, and hybridization of Cy5-dUTP-labeled cDNA (that is, mRNA expression at 9.5 hours) is
represented as a red signal. Thus, genes induced or repressed after the diauxic shift appear in this
image as red and green spots, respectively. Genes expressed at roughly equal levels before and after
the diauxic shift appear in this image as yellow spots.
of HSP42, have previously been shown to
be controlled at least in part by these 
elements (21-24). Inspection of the se- 
quences upstream of HSP42 and the two 
uncharacterized genes shown in Fig. 5C, 
YKL026c, a hypothetical protein with 
similarity to glutathione peroxidase, and 
YGR043c, a putative transaldolase, revealed
that each of these genes also possesses
repeated upstream copies of the stress-responsive
CCCCT motif. Of the 13 additional
genes in the yeast genome that
shared this expression profile [including 
HSP30, ALD2, OM45, and 10 uncharac- 
terized ORFs (25)], nine contained one or 
more recognizable STRE sites in their up- 
stream regions. 

The heterotrimeric transcriptional activator
complex HAP2,3,4 has been shown
to be responsible for induction of several 
genes important for respiration (26-28). 
This complex binds a degenerate consensus 
sequence known as the CCAAT box (26). 
Computer analysis, using the consensus se- 
quence TNRYTGGB (29), has suggested 
that a large number of genes involved in 
respiration may be specific targets of 
HAP2,3,4 (30). Indeed, a putative 
HAP2,3,4 binding site could be found in 
the sequences upstream of each of the seven 
cytochrome c-related genes that showed 
the greatest magnitude of induction (Fig. 
5D). Of 12 additional cytochrome c-related 
genes that were induced, HAP2,3,4 binding
sites were present in all but one. Significantly,
we found that transcription of
HAP4 itself was induced nearly ninefold 
concomitant with the diauxic shift. 
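
A degenerate consensus such as TNRYTGGB can be searched for by expanding each IUPAC nucleotide code into a character class. The sketch below is one way to do this under stated assumptions: it scans both strands and reports 0-based forward-strand start positions, neither of which is specified in the text, and the test sequence is made up.

```python
import re

# IUPAC nucleotide codes needed for the consensus TNRYTGGB.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]", "B": "[CGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def consensus_to_regex(consensus):
    """Compile a consensus into a regex; the lookahead lets
    overlapping matches be reported."""
    body = "".join(IUPAC[code] for code in consensus)
    return re.compile("(?=(" + body + "))")

def find_sites(seq, consensus):
    """Return sorted (start, strand) hits, with `start` given as a
    0-based coordinate on the forward strand."""
    seq = seq.upper()
    pattern = consensus_to_regex(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    reverse = seq.translate(COMPLEMENT)[::-1]
    width = len(consensus)
    hits += [(len(seq) - m.start() - width, "-") for m in pattern.finditer(reverse)]
    return sorted(hits)

# Made-up upstream sequence containing one forward-strand match (TCACTGGC).
print(find_sites("AAATCACTGGCAAA", "TNRYTGGB"))
```

Because the consensus is short and degenerate, matches are expected by chance in many promoters, so hits of this kind are better read as candidates than as demonstrated binding sites.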

Control of ribosomal protein biogenesis 
is mainly exerted at the transcriptional 
level, through the presence of a common 
upstream-activating element (UAS)
that is recognized by the Rap1 DNA-binding
protein (31, 32). The expression profiles
of seven ribosomal proteins are shown
in Fig. 5F. A search of the sequences
upstream of all seven genes revealed consensus
Rap1-binding motifs (33). It has
been suggested that declining Rap1 levels
in the cell during starvation may be responsible
for the decline in ribosomal protein
gene expression (34). Indeed, we observed
that the abundance of RAP1
mRNA diminished by 4.4-fold at about
the time of glucose exhaustion.

Of the 149 genes that encode known or 
putative transcription factors, only two, 
HAP4 and SIP4, were induced by a factor of
more than threefold at the diauxic shift.
SIP4 encodes a DNA-binding transcriptional
activator that has been shown to
interact with Snf1, the "master regulator" of
glucose repression (35). The eightfold induction
of SIP4 upon depletion of glucose
strongly suggests a role in the induction of
downstream genes at the diauxic shift.

Although most of the transcriptional 
responses that we observed were not pre- 
viously known, the responses of many 
genes during the diauxic shift have been 
described. Comparison of the results we 
obtained by DNA microarray hybridiza- 
tion with previously reported results there- 
fore provided a strong test of the sensitiv- 
ity and accuracy of this approach. The 
expression patterns we observed for previ- 
ously characterized genes showed almost 
perfect concordance with previously pub- 
lished results (36). Moreover, the differ- 
ential expression measurements obtained 
by DNA microarray hybridization were re- 
producible in duplicate experiments. For 
example, the remarkable changes in gene 
expression between cells harvested imme- 
diately after inoculation and immediately 
after the diauxic shift (the first and sixth 
intervals in this time series) were mea- 
sured in duplicate, independent DNA mi- 
croarray hybridizations. The correlation 
coefficient for two complete sets of expres- 
sion ratio measurements was 0.87, and for 
more than 95% of the genes, the expression
ratios measured in these duplicate
experiments differed by less than a factor 
of 2. However, in a few cases, there were 
discrepancies between our results and pre- 
vious results, pointing to technical limita- 
tions that will need to be addressed as 
DNA microarray technology advances 
(37, 38). Despite the noted exceptions, 
the high concordance between the results 
we obtained in these experiments and 
those of previous studies provides confi- 
dence in the reliability and thoroughness 
of the survey. 
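
The two reproducibility measures quoted above, a correlation coefficient between duplicate hybridizations and the fraction of genes whose ratios agree within a factor of 2, can be computed as follows. This is a sketch only: the text does not say whether the correlation was calculated on raw or log-transformed ratios (log2 ratios are assumed here), and the replicate values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def fraction_within_twofold(rep1, rep2):
    """Fraction of genes whose two ratios differ by less than a factor of 2."""
    agree = sum(1 for a, b in zip(rep1, rep2) if abs(math.log2(a / b)) < 1.0)
    return agree / len(rep1)

# Hypothetical expression ratios for the same genes in two replicates.
rep1 = [1.0, 2.0, 4.0, 0.5]
rep2 = [1.0, 2.0, 4.0, 2.0]

log1 = [math.log2(r) for r in rep1]
log2_ = [math.log2(r) for r in rep2]
print(pearson(log1, log2_))
print(fraction_within_twofold(rep1, rep2))
```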

The changes in gene expression during 
this diauxic shift are complex and involve 
integration of many kinds of information 
about the nutritional and metabolic state 
of the cell. The large number of genes 
whose expression is altered and the diver- 
sity of temporal expression profiles ob- 
served in this experiment highlight the 
challenge of understanding the underlying 
regulatory mechanisms. One approach to 
defining the contributions of individual 
regulatory genes to a complex program of 
this kind is to use DNA microarrays to 
identify genes whose expression is affected 



Fig. 2. The section of the array indicated by the gray box in Fig. 1 is shown for each of
the experiments described here. Representative genes are labeled. In each of the arrays
used to analyze gene expression during the diauxic shift, red spots represent genes that
were induced relative to the initial timepoint, and green spots represent genes that were
repressed relative to the initial timepoint. In the arrays used to analyze the effects of
the tup1Δ mutation and YAP1 overexpression, red spots represent genes whose expression was
increased, and green spots represent genes whose expression was decreased by the genetic
modification. Note that distinct sets of genes are induced and repressed in the different
experiments. The complete images of each of these arrays can be viewed on the Internet (13).
Cell density as measured by optical density (OD) at 600 nm was used to measure the growth
of the culture.
by mutations in each putative regulatory
gene. As a test of this strategy, we analyzed
the genomewide changes in gene expression
that result from deletion of the TUP1 gene.
Transcriptional repression of many genes by
glucose requires the DNA-binding repressor
Mig1 and is mediated by recruiting the transcriptional
co-repressors Tup1 and Cyc8/Ssn6
(39). Tup1 has also been implicated in
repression of oxygen-regulated, mating-type-specific,
and DNA-damage-inducible genes
(40).








Fig. 3. Metabolic reprogramming inferred from global analysis of changes in gene expression. Only key 
metabolic intermediates are identified. The yeast genes encoding the enzymes that catalyze each step 
in this metabolic circuit are identified by name in the boxes. The genes encoding succinyl-CoA synthase 
and glycogen-debranching enzyme have not been explicitly identified, but the ORFs YGR244 and 
YPR184 show significant homology to known succinyl-CoA synthase and glycogen-debranching en- 
zymes, respectively, and are therefore included in the corresponding steps in this figure. Red boxes with 
white lettering identify genes whose expression increases in the diauxic shift. Green boxes with dark 
green lettering identify genes whose expression diminishes in the diauxic shift. The magnitude of 
induction or repression is indicated for these genes. For multimeric enzyme complexes, such as 
succinate dehydrogenase, the indicated fold-induction represents an unweighted average of all the 
genes listed in the box. Black and white boxes indicate no significant differential expression (less than
twofold). The direction of the arrows connecting reversible enzymatic steps indicates the direction of the
flow of metabolic intermediates, inferred from the gene expression pattern, after the diauxic shift. Arrows 
representing steps catalyzed by genes whose expression was strongly induced are highlighted in red. 
The broad gray arrows represent major increases in the flow of metabolites after the diauxic shift, 
inferred from the indicated changes in gene expression. 



REPORTS 



Wild-type yeast cells and cells bearing
a deletion of the TUP1 gene (tup1Δ) were
grown in parallel cultures in rich medium
containing glucose as the carbon source.
Messenger RNA was isolated from exponentially
growing cells from the two populations
and used to prepare cDNA labeled
with Cy3 (green) and Cy5 (red),
respectively (11). The labeled probes were
mixed and simultaneously hybridized to
the microarray. Red spots on the microarray
therefore represented genes whose
transcription was induced in the tup1Δ
strain, and thus presumably repressed by
Tup1 (41). A representative section of the
microarray (Fig. 2, bottom middle panel)
illustrates that the genes whose expression
was affected by the tup1Δ mutation were,
in general, distinct from those induced
upon glucose exhaustion [complete images
of all the arrays shown in Fig. 2 are available
on the Internet (13)]. Nevertheless,
34 (10%) of the genes that were induced
by a factor of at least 2 after the diauxic
shift were similarly induced by deletion of
TUP1, suggesting that these genes may be
subject to TUP1-mediated repression by
glucose. For example, SUC2, the gene encoding
invertase, and all five hexose transporter
genes that were induced during the
course of the diauxic shift were similarly
induced, in duplicate experiments, by the
deletion of TUP1.

The set of genes affected by Tup1 in this
experiment also included α-glucosidases,
the mating-type-specific genes MFA1 and
MFA2, and the DNA damage-inducible
RNR2 and RNR4, as well as genes involved
in flocculation and many genes of unknown
function. The hybridization signal corresponding
to expression of TUP1 itself was
also severely reduced because of the (incomplete)
deletion of the transcription unit
in the tup1Δ strain, providing a positive
control in the experiment (42).

Many of the transcriptional targets of
Tup1 fell into sets of genes with related
biochemical functions. For instance, although
only about 3% of all yeast genes
appeared to be TUP1-repressed by a factor
of more than 2 in duplicate experiments
under these conditions, 6 of the 13 genes
that have been implicated in flocculation
(15) showed a reproducible increase in
expression of at least twofold when TUP1
was deleted. Another group of related
genes that appeared to be subject to TUP1
repression encodes the serine-rich cell
wall mannoproteins, such as Tip1 and
Tir1/Srp1, which are induced by cold
shock and other stresses (43), and similar,
serine-poor proteins, the seripauperins
(44). Messenger RNA levels for 23 of the
26 genes in this group were reproducibly
elevated by at least 2.5-fold in the tup1Δ
strain, and 18 of these genes were induced
by more than sevenfold when TUP1 was
deleted. In contrast, none of 83 genes that
could be classified as putative regulators of
the cell division cycle were induced more
than twofold by deletion of TUP1. Thus,
despite the diversity of the regulatory systems
that employ Tup1, most of the genes
that it regulates under these conditions
fall into a limited number of distinct functional
classes.
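
The contrast drawn above between the flocculation genes (6 of 13 affected) and the genome as a whole (about 3% affected) can be put on a rough quantitative footing with a binomial tail probability. This calculation is an illustration added here, not an analysis reported by the authors, and it assumes that genes respond independently with a background probability of 0.03.

```python
from math import comb

def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# 6 of 13 flocculation genes affected, against a ~3% genome-wide background.
p_value = binomial_tail(6, 13, 0.03)
print(p_value)
```

A hypergeometric test against the full gene list would be the more exact choice when the total numbers of affected and unaffected genes are known; the binomial form is used here because the text gives only the approximate background rate.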

Because the microarray allows us to
monitor expression of nearly every gene in
yeast, we can, in principle, use this approach
to identify all the transcriptional
targets of a regulatory protein like Tup1. It
is important to note, however, that in any
single experiment of this kind we can only
recognize those target genes that are normally
repressed (or induced) under the
conditions of the experiment. For instance,
the experiment described here analyzed
a MATα strain in which MFA1
and MFA2, the genes encoding the a-factor
mating pheromone precursor, are
normally repressed. In the isogenic tup1Δ
strain, these genes were inappropriately
expressed, reflecting the role that Tup1
plays in their repression. Had we instead
carried out this experiment with a MATa
strain (in which expression of MFA1 and
MFA2 is not repressed), it would not have
been possible to conclude anything regarding
the role of Tup1 in the repression
of these genes. Conversely, we cannot distinguish
indirect effects of the chronic
absence of Tup1 in the mutant strain from
effects directly attributable to its participation
in repressing the transcription of a
gene.

Another simple route to modulating the
activity of a regulatory factor is to overexpress
the gene that encodes it. YAP1 encodes
a DNA-binding transcription factor
belonging to the b-zip class of DNA-binding
proteins. Overexpression of YAP1 in
yeast confers increased resistance to hydrogen
peroxide, o-phenanthroline, heavy
metals, and osmotic stress (45). We analyzed
differential gene expression between a
wild-type strain bearing a control plasmid
and a strain with a plasmid expressing YAP1
under the control of the strong GAL1-10
promoter, both grown in galactose (that is,
a condition that induces YAP1 overexpression).
Complementary DNA from the control
and YAP1 overexpressing strains, labeled
with Cy3 and Cy5, respectively, was
prepared from mRNA isolated from the two
strains and hybridized to the microarray.
Thus, red spots on the array represent genes
that were induced in the strain overexpressing
YAP1.

Of the 17 genes whose mRNA levels
increased by more than threefold when
YAP1 was overexpressed in this way, five
bear homology to aryl-alcohol oxidoreductases
(Fig. 2 and Table 1). An additional
four of the genes in this set also belong to
the general class of dehydrogenases/oxidoreductases.
Very little is known about
the role of aryl-alcohol oxidoreductases in
S. cerevisiae, but these enzymes have been
isolated from ligninolytic fungi, in which
they participate in coupled redox reactions,
oxidizing aromatic and aliphatic
unsaturated alcohols to aldehydes with the
production of hydrogen peroxide (46, 47).
The fact that a remarkable fraction of the
targets identified in this experiment belong
to the same small, functional group of
oxidoreductases suggests that these genes
might play an important protective role
during oxidative stress. Transcription of a
small number of genes was reduced in the
strain overexpressing Yap1. Interestingly,
many of these genes encode sugar permeases
or enzymes involved in inositol
metabolism.

Fig. 4. Coordinated regulation of functionally related genes. The curves represent the
average induction or repression ratios for all the genes in each indicated group. The total
number of genes in each group was as follows: ribosomal proteins, 112; translation
elongation and initiation factors, 25; tRNA synthetases (excluding mitochondrial synthetases),
17; glycogen and trehalose synthesis and degradation, 15; cytochrome c oxidase and reductase
proteins, 19; and TCA- and glyoxylate-cycle enzymes, 24.

Table 1. Genes induced by YAP1 overexpression. This list includes all the genes for which mRNA levels
increased by more than twofold upon YAP1 overexpression in both of two duplicate experiments, and
for which the average increase in mRNA level in the two experiments was greater than threefold (50).
Positions of the canonical Yap1 binding sites upstream of the start codon, when present, and the
average fold-increase in mRNA levels measured in the two experiments are indicated.

ORF       Distance of Yap1     Gene   Description                                                       Fold-
          site from ATG                                                                                 increase
YNL331C   -                    -      Putative aryl-alcohol reductase                                   12.9
YKL071W   162-222 (5 sites)    -      Similarity to bacterial csgA protein                              10.4
YML007W   -                    YAP1   Transcriptional activator involved in oxidative stress response   9.8
YFL056C   223, 242             -      Homology to aryl-alcohol dehydrogenases                           9.0
YLL060C   98                   -      Putative glutathione transferase                                  7.4
YOL165C   266                  -      Putative aryl-alcohol dehydrogenase (NADP+)                       7.0
YCR107W   -                    -      Putative aryl-alcohol reductase                                   6.5
YML116W   409                  ATR1   Aminotriazole and 4-nitroquinoline resistance protein             6.5
YBR008C   142, 167, 364        -      Homology to benomyl/methotrexate resistance protein               6.1
YCLX08C   -                    -      Hypothetical protein                                              6.1
YJR155W   -                    -      Putative aryl-alcohol dehydrogenase                               6.0
YPL171C   148, 212             OYE3   NADPH dehydrogenase (old yellow enzyme), isoform 3                5.8
YLR460C   167, 317             -      Homology to hypothetical proteins YCR102C and YNL134C             4.7
YKR076W   178                  -      Homology to hypothetical protein YMR251w                          4.5
YHR179W   327                  OYE2   NAD(P)H oxidoreductase (old yellow enzyme), isoform 1             4.1
YML131W   507                  -      Similarity to A. thaliana zeta-crystallin homolog                 3.7
YOL126C   -                    MDH2   Malate dehydrogenase                                              3.3

We searched for Yap1-binding sites
(TTACTAA or TGACTAA) in the sequences
upstream of the target genes we
identified (48). About two-thirds of the
genes that were induced by more than
threefold upon Yap1 overexpression had
one or more binding sites within 600 bases
upstream of the start codon (Table 1), suggesting
that they are directly regulated by
Yap1. The absence of canonical Yap1-binding
sites upstream of the others may reflect
an ability of Yap1 to bind sites that differ
from the canonical binding sites, perhaps in
cooperation with other factors, or less likely,
may represent an indirect effect of Yap1
overexpression, mediated by one or more
intermediary factors. Yap1 sites were found
only four times in the corresponding region
of an arbitrary set of 30 genes that were not
differentially regulated by Yap1.
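
The site search described above amounts to scanning the 600 bases upstream of each start codon for TTACTAA or TGACTAA. The following sketch shows one way to do that. It assumes that the upstream sequence is supplied 5' to 3' and ends immediately before the ATG, that both strands are scanned, and that distance is measured from the first base of the match to the start codon; none of these conventions is stated in the text, and the function name is illustrative.

```python
import re

# TTACTAA or TGACTAA on the given strand, and their reverse
# complements (TTAGTAA, TTAGTCA) for sites on the opposite strand.
SITE = re.compile(r"(?=(T[TG]ACTAA))")
SITE_RC = re.compile(r"(?=(TTAGT[AC]A))")

def yap1_site_distances(upstream, window=600):
    """Return distances (in bases) from each site start to the ATG,
    farthest first, within the last `window` bases of `upstream`."""
    region = upstream.upper()[-window:]
    starts = {m.start() for m in SITE.finditer(region)}
    starts |= {m.start() for m in SITE_RC.finditer(region)}
    return sorted((len(region) - s for s in starts), reverse=True)

# Made-up upstream sequence with a single forward-strand site.
example = "A" * 10 + "TTACTAA" + "C" * 20
print(yap1_site_distances(example))
```

Running the same scan over a control set of promoters, as the authors did with 30 unregulated genes, gives a sense of how often these 7-base sites occur by chance.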

Use of a DNA microarray to characterize
the transcriptional consequences of
mutations affecting the activity of regulatory
molecules provides a simple and powerful
approach to dissection and characterization
of regulatory pathways and networks.
This strategy also has an important
practical application in drug screening. 
Mutations in specific genes encoding can- 
didate drug targets can serve as surrogates 
for the ideal chemical inhibitor or modu- 
lator of their activity. DNA microarrays 
can be used to define the resulting signa- 
ture pattern of alterations in gene expres- 
sion, and then subsequently used in an 
assay to screen for compounds that repro- 
duce the desired signature pattern. 

DNA microarrays provide a simple and 
economical way to explore gene expres- 
sion patterns on a genomic scale. The 
hurdles to extending this approach to any 
other organism are minor. The equipment
required for fabricating and using DNA
microarrays (9) consists of components 
that were chosen for their modest cost and 
simplicity. It was feasible for a small group 
to accomplish the amplification of more 
than 6000 genes in about 4 months and, 
once the amplified gene sequences were in 
hand, only 2 days were required to print a 
set of 110 microarrays of 6400 elements 
each. Probe preparation, hybridization, 
and fluorescent imaging are also simple 
procedures. Even conceptually simple ex- 
periments, as we described here, can yield 
vast amounts of information. The value of 
the information from each experiment of 
this kind will progressively increase as 
more is learned about the functions of 
each gene and as additional experiments 
define the global changes in gene expres- 
sion in diverse other natural processes and 
genetic perturbations. Perhaps the greatest 
challenge now is to develop efficient 
methods for organizing, distributing, inter- 
preting, and extracting insights from the 
large volumes of data these experiments 
will provide. 

REFERENCES AND NOTES 



1. M. Schena, D. Shalon, R. W. Davis, P. O. Brown,
Science 270, 467 (1995).

2. D. Shalon, S. J. Smith, P. O. Brown, Genome Res. 6,
639 (1996).

3. D. Lashkari, Proc. Natl. Acad. Sci. U.S.A., in press.

4. J. DeRisi et al., Nature Genet. 14, 457 (1996).

5. D. J. Lockhart et al., Nature Biotechnol. 14, 1675
(1996).

6. M. Chee et al., Science 274, 610 (1996).

7. M. Johnston and M. Carlson, in The Molecular Biology
of the Yeast Saccharomyces: Gene Expression,
E. W. Jones, J. R. Pringle, J. R. Broach, Eds. (Cold
Spring Harbor Laboratory Press, Cold Spring Harbor,
NY, 1992), p. 193.

8. Primers for each known or predicted protein coding
sequence were supplied by Research Genetics.
PCR was performed with the protocol supplied by
Research Genetics, using genomic DNA from yeast
strain S288C as a template. Each PCR product was
verified by agarose gel electrophoresis and was
deemed correct if the lane contained a single band of
appropriate mobility. Failures were marked as such
in the database. The overall success rate for a
single-pass amplification of 6116 ORFs was ~94.5%.

9. Glass slides (Gold Seal) were cleaned for 2 hours in a
solution of 2 N NaOH and 70% ethanol. After rinsing
in distilled water, the slides were then treated with a
1:5 dilution of poly-L-lysine adhesive solution (Sigma)
for 1 hour, and then dried for 5 min at 40°C in a
vacuum oven. DNA samples from 100-µl PCR reactions
were purified by ethanol purification in 96-well
microtiter plates. The resulting precipitates were
resuspended in 3x standard saline citrate (SSC) and
transferred to new plates for arraying. A custom-built
arraying robot was used to print on a batch of 110
slides. Details of the design of the microarrayer are
available at cmgm.stanford.edu/pbrown. After printing,
the microarrays were rehydrated for 30 s in a
humid chamber and then snap-dried for 2 s on a hot
plate (100°C). The DNA was then ultraviolet (UV)
cross-linked to the surface by subjecting the slides to
60 mJ of energy (Stratagene Stratalinker). The rest of
the poly-L-lysine surface was blocked by a 15-min
incubation in a solution of 70 mM succinic anhydride
dissolved in a solution consisting of 315 ml of
1-methyl-2-pyrrolidinone (Aldrich) and 35 ml of 1 M
boric acid (pH 8.0). Directly after the blocking reaction,
the bound DNA was denatured by a 2-min incubation
in distilled water at ~95°C. The slides were
then transferred into a bath of 100% ethanol at room
temperature, rinsed, and then spun dry in a clinical
centrifuge. Slides were stored in a closed box at
room temperature until used.

[Fig. 5 image not reproduced; panels are plotted against time (hours).]

Fig. 5. Distinct temporal patterns of induction or repression help to group genes that share regulatory
properties. (A) Temporal profile of the cell density, as measured by OD at 600 nm, and glucose
concentration in the media. (B) Seven genes exhibited a strong induction (greater than ninefold) only at
the last timepoint (20.5 hours). With the exception of IDP2, each of these genes has a CSRE UAS. There
were no additional genes observed to match this profile. (C) Seven members of a class of genes marked
by early induction with a peak in mRNA levels at 18.5 hours. Each of these genes contains STRE motif
repeats in its upstream promoter region. (D) Cytochrome c oxidase and ubiquinol cytochrome c
reductase genes. Marked by an induction coincident with the diauxic shift, each of these genes contains
a consensus binding motif for the HAP2,3,4 protein complex. At least 17 genes shared a similar
expression profile. (E) SAM1, GPP1, and several genes of unknown function are repressed before the
diauxic shift, and continue to be repressed upon entry into stationary phase. (F) Ribosomal protein
genes comprise a large class of genes that are repressed upon depletion of glucose. Each of the genes
profiled here contains one or more RAP1-binding motifs upstream of its promoter. RAP1 is a
transcriptional regulator of most ribosomal proteins.

www.sciencemag.org • SCIENCE • VOL. 278 • 24 OCTOBER 1997

10. YPD medium (8 liters), in a 10-liter fermentation
vessel, was inoculated with 2 ml of a fresh overnight
culture of yeast strain DBY7286 (MATa, ura3,
GAL2). The fermentor was maintained at 30°C with
constant agitation and aeration. The glucose content
of the media was measured with a UV test kit
(Boehringer Mannheim, catalog number 716251).
Cell density was measured by OD at 600-nm wavelength.
Aliquots of culture were rapidly withdrawn
from the fermentation vessel by peristaltic pump,
spun down at room temperature, and then flash
frozen with liquid nitrogen. Frozen cells were stored
at -80°C.

11. Cy3-dUTP or Cy5-dUTP (Amersham) was incorporated
during reverse transcription of 1.25 µg of
polyadenylated [poly(A)+] RNA, primed by a dT(16)
oligomer. This mixture was heated to 70°C for 10
min, and then transferred to ice. A premixed solution,
consisting of 200 U Superscript II (Gibco),
buffer, deoxyribonucleoside triphosphates, and
fluorescent nucleotides, was added to the RNA.
Nucleotides were used at these final concentrations:
500 µM for dATP, dCTP, and dGTP and 200 µM
for dTTP. Cy3-dUTP and Cy5-dUTP were used at
a final concentration of 100 µM. The reaction was
then incubated at 42°C for 2 hours. Unincorporated
fluorescent nucleotides were removed by first
diluting the reaction mixture with 470 µl of 10
mM tris-HCl (pH 8.0)/1 mM EDTA and then
concentrating the mix to ~5 µl, using Centricon-30
microconcentrators (Amicon).

12. Purified, labeled cDNA was resuspended in 11 µl of
3.5x SSC containing 10 µg poly(dA) and 0.3 µl of
10% SDS. Before hybridization, the solution was
boiled for 2 min and then allowed to cool to room
temperature. The solution was applied to the
microarray under a cover slip, and the slide was
placed in a custom hybridization chamber, which
was subsequently incubated for ~8 to 12 hours in
a water bath at 62°C. Before scanning, slides were
washed in 2x SSC, 0.2% SDS for 5 min, and then
0.05x SSC for 1 min. Slides were dried before
scanning by centrifugation at 500 rpm in a Beckman
CS-6R centrifuge.

13. The complete data set is available on the Internet at
cmgm.stanford.edu/pbrown/explore/index.html

14. For 95% of all the genes analyzed, the mRNA levels
measured in cells harvested at the first and second
interval after inoculation differed by a factor of less
than 1.5. The correlation coefficient for the comparison
between mRNA levels measured for each gene
in these two different mRNA samples was 0.98.
When duplicate mRNA preparations from the same
cell sample were compared in the same way, the
correlation coefficient between the expression levels
measured for the two samples by comparative
hybridization was 0.99.

15. The numbers and identities of known and putative
genes, and their homologies to other genes, were
gathered from the following public databases:
Saccharomyces Genome Database
(genome-www.stanford.edu), Yeast Protein Database
(quest7.proteome.com), and Munich Information Centre for
Protein Sequences
(speedy.mips.biochem.mpg.de/mips/yeast/index.htmlx).

16. A. Scholer and H. J. Schuller, Mol. Cell. Biol. 14,
3613 (1994).

17. S. Kratzer and H. J. Schuller, Gene 161, 75 (1995).

18. R. J. Haselbeck and H. L. McAlister, J. Biol. Chem.
268, 12116 (1993).

19. M. Fernandez, E. Fernandez, R. Rodicio, Mol. Gen.
Genet. 242, 727 (1994).

20. A. Hartig et al., Nucleic Acids Res. 20, 5677 (1992).

21. P. M. Martinez et al., EMBO J. 15, 2227 (1996).

22. J. C. Varela, U. M. Praekelt, P. A. Meacock, R. J.
Planta, W. H. Mager, Mol. Cell. Biol. 15, 6232 (1995).

23. H. Ruis and C. Schuller, Bioessays 17, 959 (1995).

24. J. L. Parrou, M. A. Teste, J. Francois, Microbiology
143, 1891 (1997).



25. This expression profile was defined as having an
induction of greater than 10-fold at 18.5 hours and
less than 11-fold at 20.5 hours.

26. S. L. Forsburg and L. Guarente, Genes Dev. 3, 1166
(1989).

27. J. T. Olesen and L. Guarente, ibid. 4, 1714 (1990).

28. M. Rosenkrantz, C. S. Kell, E. A. Pennell, L. J.
Devenish, Mol. Microbiol. 13, 119 (1994).

29. Single-letter abbreviations for the amino acid residues
are as follows: A, Ala; C, Cys; D, Asp; E, Glu; F,
Phe; G, Gly; H, His; I, Ile; K, Lys; L, Leu; M, Met; N,
Asn; P, Pro; Q, Gln; R, Arg; S, Ser; T, Thr; V, Val; W,
Trp; and Y, Tyr. The nucleotide codes are as follows:
B = C, G, or T; N = G, A, T, or C; R = A or G; and
Y = C or T.

30. C. Fondrat and A. Kalogeropoulos, Comput. Appl.
Biosci. 12, 363 (1996).

31. D. Shore, Trends Genet. 10, 408 (1994).

32. R. J. Planta and K. A. Raue, ibid. 4, 64 (1988).

33. The degenerate consensus sequence VYCYRNNCMNH
was used to search for potential RAP1-binding
sites. The exact consensus, as defined by (30), is
WACAYCCRTACATYW, with up to three differences
allowed.

34. S. F. Neuman, S. Bhattacharya, J. R. Broach, Mol.
Cell. Biol. 15, 3187 (1995).

35. P. Lesage, X. Yang, M. Carlson, ibid. 16, 1921
(1996).

36. For example, we observed large inductions of the
genes coding for PCK1, FBP1 [Z. Yin et al., Mol.
Microbiol. 20, 751 (1996)], the central glyoxylate
cycle gene ICL1 [A. Scholer and H. J. Schuller,
Curr. Genet. 23, 375 (1993)], and the "aerobic"
isoform of acetyl-CoA synthase, ACS1 [M. A. van
den Berg et al., J. Biol. Chem. 271, 28953 (1996)],
with concomitant down-regulation of the
glycolytic-specific genes PYK1 and PFK2 [P. A. Moore
et al., Mol. Cell. Biol. 11, 5330 (1991)]. Other genes
not directly involved in carbon metabolism but
known to be induced upon nutrient limitation
include genes encoding cytosolic catalase T, CTT1
[P. H. Bissinger et al., ibid. 9, 1309 (1989)], and
several genes encoding small heat-shock proteins,
such as HSP12, HSP26, and HSP42 [I. Farkas et
al., J. Biol. Chem. 266, 15602 (1991); U. M.
Praekelt and P. A. Meacock, Mol. Gen. Genet. 223,
97 (1990); D. Wotton et al., J. Biol. Chem. 271,
2717 (1996)].

37. The levels of induction we measured for genes that
were expressed at very low levels in the uninduced
state (notably, FBP1 and PCK1) were generally lower
than those previously reported. This discrepancy
was likely due to the conservative background
subtraction method we used, which generally resulted in
overestimation of very low expression levels (46).

38. Cross-hybridization of highly related sequences can
also occasionally obscure changes in gene expression,
an important concern where members of gene
families are functionally specialized and differentially
regulated. The major alcohol dehydrogenase genes,
ADH1 and ADH2, share 88% nucleotide identity.
Reciprocal regulation of these genes is an important
feature of the diauxic shift, but was not observed in
this experiment, presumably because of
cross-hybridization of the fluorescent cDNAs representing
these two genes. Nevertheless, we were able to detect
differential expression of closely related isoforms
of other enzymes, such as HXK1/HXK2 (77% identical)
[P. Herrero et al., Yeast 11, 137 (1995)], MLS1/DAL7
(73% identical) (20), and PGM1/PGM2 (72%
identical) [D. Oh, J. E. Hopper, Mol. Cell. Biol. 10,
1415 (1990)], in accord with previous studies. Use in
the microarray of deliberately selected DNA sequences
corresponding to the most divergent segments
of homologous genes, in lieu of the complete
gene sequences, should relieve this problem in many
cases.

39. F. E. Williams, U. Varanasi, R. J. Trumbly, Mol. Cell.
Biol. 11, 3307 (1991).

40. D. Tzamarias and K. Struhl, Nature 369, 758 (1994).

41. Differences in mRNA levels between the tup1Δ and
wild-type strain were measured in two independent
experiments. The correlation coefficient between the
complete sets of expression ratios measured in
these duplicate experiments was 0.83. The concordance
between the sets of genes that appeared to
be induced was very high between the two experiments.
When only the 355 genes that showed at
least a twofold increase in mRNA in the tup1Δ strain
in either of the duplicate experiments were compared,
the correlation coefficient was 0.82.

42. The tup1Δ mutation consists of an insertion of the
LEU2 coding sequence, including a stop codon,
between the ATG of TUP1 and an Eco RI site 124 base
pairs before the stop codon of the TUP1 gene.

43. L. R. Kowalski, K. Kondo, M. Inouye, Mol. Microbiol.
15, 341 (1995).

44. M. Viswanathan, G. Muthukumar, Y. S. Cong, J.
Lenard, Gene 148, 149 (1994).

45. D. Hirata, K. Yano, T. Miyakawa, Mol. Gen. Genet.
242, 250 (1994).

46. A. Gutierrez, L. Caramelo, A. Prieto, M. J. Martinez,
A. T. Martinez, Appl. Environ. Microbiol. 60, 1783
(1994).

47. A. Muheim et al., Eur. J. Biochem. 195, 369 (1991).

48. J. A. Wemmie, M. S. Szczypka, D. J. Thiele, W. S.
Moye-Rowley, J. Biol. Chem. 269, 32592 (1994).

49. Microarrays were scanned using a custom-built
scanning laser microscope built by S. Smith with
software written by N. Ziv. Details concerning scanner
design and construction are available at
cmgm.stanford.edu/pbrown. Images were scanned at a
resolution of 20 µm per pixel. A separate scan, using
the appropriate excitation line, was done for each of
the two fluorophores used. During the scanning process,
the ratio between the signals in the two channels
was calculated for several array elements containing
total genomic DNA. To normalize the two
channels with respect to overall intensity, we then
adjusted photomultiplier and laser power settings
such that the signal ratio at these elements was as
close to 1.0 as possible. The combined images were
analyzed with custom-written software. A bounding
box, fitted to the size of the DNA spots in each
quadrant, was placed over each array element. The
average fluorescent intensity was calculated by summing
the intensities of each pixel present in a bounding
box, and then dividing by the total number of
pixels. Local area background was calculated for
each array element by determining the average
fluorescent intensity for the lower 20% of pixel
intensities. Although this method tends to underestimate
the background, causing an underestimation of
extreme ratios, it produces a very consistent and
noise-tolerant approximation. Although the
analog-to-digital board used for data collection possesses a
wide dynamic range (12 bits), several signals were
saturated (greater than the maximum signal intensity
allowed) at the chosen settings. Therefore, extreme
ratios at bright elements are generally underestimated.
A signal was deemed significant if the average
intensity after background subtraction was at least
2.5-fold higher than the standard deviation in the
background measurements for all elements on the
array.

50. In addition to the 17 genes shown in Table 1, three
additional genes were induced by an average of
more than threefold in the duplicate experiments, but
in one of the two experiments, the induction was less
than twofold (range 1.6- to 1.9-fold).

51. We thank H. Bennett, P. Spellman, J. Ravetto, M.
Eisen, R. Pillai, B. Dunn, T. Ferea, and other members
of the Brown lab for their assistance and helpful
advice. We also thank S. Friend, D. Botstein, S.
Smith, J. Hudson, and D. Dolginow for advice, support,
and encouragement; K. Struhl and S. Chatterjee
for the Tup1 deletion strain; L. Fernandes for
helpful advice on Yap1; and S. Klapholz and the
reviewers for many helpful comments on the manuscript.
Supported by a grant from the National Human
Genome Research Institute (NHGRI)
(HG00450), and by the Howard Hughes Medical
Institute (HHMI). J.D.R. was supported by the HHMI
and the NHGRI. V.R. was supported in part by an
Institutional Training Grant in Genome Science (T32
HG00044) from the NHGRI. P.O.B. is an associate
investigator of the HHMI.
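
The degenerate-consensus search mentioned in note 33 can be sketched as follows. This is an illustration only, not the software used in this work: it assumes the standard IUPAC nucleotide codes (V, H, M, and W are not defined in note 29, so their meanings here are an assumption), searches both strands, and reports overlapping matches. The function names and the toy sequence are hypothetical, and the "up to three differences" matching against the exact consensus is not implemented.

```python
import re

# Standard IUPAC nucleotide codes mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[CG]", "W": "[AT]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def consensus_to_pattern(consensus):
    """Compile a degenerate consensus into a regex that finds overlapping hits."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    # A lookahead makes every match zero-width, so overlapping sites are kept.
    return re.compile("(?=(" + body + "))")


def find_sites(consensus, sequence):
    """Return sorted (position, strand) pairs in forward-strand coordinates."""
    pattern = consensus_to_pattern(consensus)
    seq = sequence.upper()
    width = len(consensus)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert a reverse-strand start into the forward-strand coordinate
        # of the leftmost base covered by the site.
        hits.append((len(seq) - m.start() - width, "-"))
    return sorted(hits)


# Toy upstream sequence (hypothetical) containing one forward-strand match.
promoter = "GGGACCCATTCATTGGG"
sites = find_sites("VYCYRNNCMNH", promoter)
print(sites)
```

Positions are zero-based. Counting such hits across the upstream regions of a gene set, and comparing with a control set, is the kind of tally the text reports.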

5 September 1997; accepted 22 September 1997
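
The spot quantitation described in note 49 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the custom software referred to in that note: pixel intensities are passed as a flat list for one bounding box, the local background is the mean of the lowest 20% of those pixels, and the significance rule compares the background-subtracted signal with 2.5 times the standard deviation of the background values across all elements. The function names and the use of the population standard deviation are assumptions.

```python
import statistics


def quantify_spot(pixels, low_fraction=0.2):
    """Return (mean intensity, local background, net signal) for one element.

    pixels: intensities of every pixel inside the element's bounding box.
    The background is the mean of the lowest `low_fraction` of the pixels.
    """
    if not pixels:
        raise ValueError("bounding box contains no pixels")
    mean_intensity = sum(pixels) / len(pixels)
    ordered = sorted(pixels)
    n_low = max(1, int(len(ordered) * low_fraction))
    background = sum(ordered[:n_low]) / n_low
    return mean_intensity, background, mean_intensity - background


def is_significant(net_signal, all_backgrounds, factor=2.5):
    """Apply the 2.5-fold rule against the spread of array-wide backgrounds."""
    # Population standard deviation is an assumption; the note does not say
    # which estimator was used.
    return net_signal >= factor * statistics.pstdev(all_backgrounds)


def expression_ratio(net_sample, net_reference):
    """Ratio of background-subtracted signals from the two channels."""
    if net_reference <= 0:
        raise ValueError("reference signal must be positive")
    return net_sample / net_reference


# Hypothetical pixel intensities for one array element (10 pixels).
spot = [10, 10, 12, 14, 100, 120, 110, 105, 11, 13]
mean_intensity, background, net = quantify_spot(spot)

# Hypothetical local backgrounds measured for four elements on the array.
backgrounds = [10, 12, 8, 14]
print(mean_intensity, background, net, is_significant(net, backgrounds))
```

Because the lowest pixels in the box define the background, the background is biased low and weak signals are biased high, which is the trade-off that notes 37 and 49 describe.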






